[Relay] GradientCell Relay Pass #5039
Conversation
Had you rebased?
Just to be clear, this pass saves 50% of memory for reverse-mode AD.
Force-pushed from 1783a45 to 807c8e8
 * module must have TypeDefinition of GradCell (defined in gradient.rly)
 */
Constructor getGradCellConstructor(IRModule module, std::string name_hint) {
  TypeData gradCell = module->LookupTypeDef("GradCell");
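The C++ fragment above looks up the GradCell type definition in the module and then finds a constructor by name. The idea can be sketched in plain Python; the classes below (`Constructor`, `TypeData`, `Module`) are illustrative stand-ins for the TVM IR classes, not the real API:

```python
from dataclasses import dataclass, field

@dataclass
class Constructor:
    # A constructor of an algebraic data type, identified by its name hint
    name_hint: str

@dataclass
class TypeData:
    # An ADT definition: a type name plus its constructors
    header: str
    constructors: list = field(default_factory=list)

@dataclass
class Module:
    # Maps type names to their TypeData definitions
    type_defs: dict = field(default_factory=dict)

    def lookup_type_def(self, name):
        return self.type_defs[name]

    def get_constructor(self, type_name, name_hint):
        # Scan the ADT's constructors for the one matching name_hint
        type_data = self.lookup_type_def(type_name)
        for ctor in type_data.constructors:
            if ctor.name_hint == name_hint:
                return ctor
        raise KeyError(f"no constructor {name_hint} in {type_name}")

# GradCell (from gradient.rly) has constructors Raw, One, and Zero
mod = Module({"GradCell": TypeData("GradCell",
                                   [Constructor("Raw"),
                                    Constructor("One"),
                                    Constructor("Zero")])})
assert mod.get_constructor("GradCell", "Zero").name_hint == "Zero"
```

Moving this lookup onto the module (as the review suggests) keeps constructor resolution next to the type definitions it searches.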
Can you refactor this function into the module?
Yeah, added it as a GetConstructor function, although it may be more appropriate as a separate PR.
@@ -219,6 +219,18 @@ def DeadCodeElimination(inline_once=False):
    """
    return _ffi_api.DeadCodeElimination(inline_once)


def GradientCell():
Is "gradient cell" really the right name for this, given that it is just making ones/zeros lazy?
Renamed to LazyGradientInit, and highlighted that this pass should be used after reverse-mode AD to lazily initialize tensor gradients.
I left some style comments on the PR; please fix them and then I'll merge.
It would be great if you could add some information on why this pass is useful/needed. I understand that it's important to conserve memory when computing gradients, but converting all tensors to a new type just to lazily evaluate constant fills seems somewhat drastic. Can you give some numbers on the impact of this pass for typical functions?
@jwfromm This pass should be used in conjunction with the gradient pass. Tensors for reverse-mode AD are instantiated as zero-filled tensors but are not actually used during the forward pass, so this PR should reduce memory allocation during the forward pass by roughly 50%. I don't have measurements for this, but mathematically it makes sense.
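The memory argument above can be illustrated with a toy model (this is not the actual Relay pass): the adjoint buffers created by reverse-mode AD start out as zeros that are never read during the forward pass, so representing them symbolically defers the allocation until the backward pass forces them. The `LazyZeros` class below is a hypothetical stand-in for GradCell's Zero constructor:

```python
import numpy as np

class LazyZeros:
    """Records only a shape; allocates the zero tensor on first use."""
    def __init__(self, shape):
        self.shape = shape
        self._value = None  # no tensor allocated yet

    def force(self):
        # Materialize the zero-filled tensor lazily
        if self._value is None:
            self._value = np.zeros(self.shape)
        return self._value

# Forward pass: one real activation plus one lazy adjoint per tensor.
activation = np.ones((1024, 1024))   # ~8 MB actually allocated
adjoint = LazyZeros((1024, 1024))    # a few bytes of metadata

# Nothing has been allocated for the adjoint during the forward pass.
assert adjoint._value is None

# Only the backward pass forces the adjoint into a real tensor.
grad = adjoint.force() + activation
assert grad.shape == (1024, 1024)
```

If every forward tensor is paired with one such zero-filled adjoint, keeping the adjoints lazy roughly halves forward-pass allocation, which matches the 50% figure claimed above.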
Force-pushed from 5456d56 to 0d78212
Thanks @jroesch!
Squashed commit history:
* save
* gradient.rly
* fix
* NOT WORKING: gradient cell pass
* test gradient pass
* fixed basic call ops
* more tests
* fix bug
* transform calls to one ones_like zero zero_like
* maintenance stuff
* fix linting
* linting
* linting
* throw default
* remove unrelated changes
* import gradent.rly in pass
* comment
* linting
* remove changes to test files
* move gradient_cell.cc to transforms
* revert change
* update files with new commits
* type
* wrapper function to main outermost function type
* fix linting
* fix unsigned and signed int comparison
* review
* GetConstructor definition in module and change op comparison
* update node instantiations
* increase code readability

Co-authored-by: Marisa Kirisame <lolisa@marisa.moe>
Add the GradientCell Relay pass, which introduces the GradCell datatype. This pass delays memory allocation and can improve memory usage and performance by deferring the instantiation of zero-filled/one-filled tensors until they are actually needed.